-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clusterversion: introduce rac2 cluster version gates #131106
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Introduce two new cluster version gates: ``` V24_3_UseRACV2WithV1EntryEncoding V24_3_UseRACV2Full ``` Upon a range leader first encountering `V24_3_UseRACV2WithV1EntryEncoding` via `handleRaftReadyRaftMuLocked`, it will begin a new term using the replication flow control v2 protocol, creating a `RangeController` but continue using the v1 entry encoding and raft still operating in push mode. Upon a range leader first encountering `V24_3_UseRACV2Full`, it will continue using the replication flow control v2 protocol, but will now switch to using the V2 entry encoding. Note that the necessary protocol migration at the leader, (base) => `V24_3_UseRACV2WithV1EntryEncoding` occurs before any other calls in `handleRaftReadyRaftMuLocked`. The two version gates are necessary to ensure there are never v2 encoded entries in the raft log while there is a possibility of a leader running v1. Resolves: cockroachdb#131102 Release note: None
Move `EnabledWhenLeaderLevel` from `replica_rac2` to the parent package `kvflowcontrol` and rename `V2EnabledWhenLeaderLevel` to reflect the move to a shared v1/v2 package. Also move the corresponding function `racV2EnabledWhenLeaderLevel` to `kvflowcontrol`. `GetV2EnabledWhenLeaderLevel` will check if there are testing knob overrides for the enabled level, and if not continue returning `V2NotEnabledWhenLeader`. Some commentary and todos are also left around this function, for when we enable the protocol and separately, pull mode. Part of: cockroachdb#130187 Release note: None
kvoli
force-pushed
the
240920.rac-cluster-version
branch
from
September 20, 2024 17:55
ba6e03a
to
7cf9492
Compare
sumeerbhola
approved these changes
Sep 24, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, 9 of 9 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv)
TYFTR |
craig bot
pushed a commit
that referenced
this pull request
Sep 25, 2024
130728: kvserver: add rac2 v1 integration tests r=sumeerbhola a=kvoli 1st commit from #130619. 2nd-3rd commits from #131106. 4th-5th commits from #131107. 6th-7th commits from #131108. 8th commit from #131109. --- Introduce several tests in `flow_control_integration_test.go`, mirroring the existing tests but applied to the replication flow control v2 machinery. The tests largely follow an identical pattern to the existing v1 tests, swapping in rac2 metrics and vtables. The following tests are added: ``` TestFlowControlBasicV2 TestFlowControlRangeSplitMergeV2 TestFlowControlBlockedAdmissionV2 TestFlowControlAdmissionPostSplitMergeV2 TestFlowControlCrashedNodeV2 TestFlowControlRaftSnapshotV2 TestFlowControlRaftMembershipV2 TestFlowControlRaftMembershipRemoveSelfV2 TestFlowControlClassPrioritizationV2 TestFlowControlQuiescedRangeV2 TestFlowControlUnquiescedRangeV2 TestFlowControlTransferLeaseV2 TestFlowControlLeaderNotLeaseholderV2 TestFlowControlGranterAdmitOneByOneV2 ``` These tests all have at least two variants: ``` V2EnabledWhenLeaderV1Encoding V2EnabledWhenLeaderV2Encoding ``` When `V2EnabledWhenLeaderV1Encoding` is run, the tests use a different testdata file, which has a `_v1_encoding` suffix. A separate file is necessary because when the protocol enablement level is `V2EnabledWhenLeaderV1Encoding`, all entries which are subject to admission control are encoded as `raftpb.LowPri`, regardless of their original priority, as we don't want to pay the cost to deserialize the raft admission meta. The v1 encoding variants retain the same comments as the v2 encoding, however any comments referring to regular tokens should be interpreted as elastic tokens instead, due to the above. Two v1 tests are not ported over to v2: ``` TestFlowControlRaftTransportBreak TestFlowControlRaftTransportCulled ``` These omitted tests behave identically to `TestFlowControlCrashedNodeV2` as rac2 is less tightly coupled to the raft transport, instead operating on replication states (e.g., `StateProbe`, `StateReplicate`). --- Add `TestFlowControlV1ToV2Transition`, which ratchets up the enabled version of replication flow control v2: ``` v1 protocol with v1 encoding => v2 protocol with v1 encoding => v2 protocol with v2 encoding ``` The test is structured to issue writes and wait for returned tokens whenever the protocol transitions from v1 to v2, or a leader changes. More specifically, the test takes the following steps: ``` (1) Start n1, n2, n3 with v1 protocol and v1 encoding. (2) Upgrade n1 to v2 protocol with v1 encoding. (3) Transfer the range lease to n2. (4) Upgrade n2 to v2 protocol with v1 encoding. (5) Upgrade n3 to v2 protocol with v1 encoding. (6) Upgrade n1 to v2 protocol with v2 encoding. (7) Transfer the range lease to n1. (8) Upgrade n2,n3 to v2 protocol with v2 encoding. (9) Transfer the range lease to n3. ``` Between each step, we issue writes, (un)block admission and observe the flow control metrics and vtables. Resolves: #130431 Resolves: #129276 Release note: None 131252: roachtest: port decommission/mixed-versions r=srosenberg,DarrylWong a=renatolabs This commit ports the `decommission/mixed-versions` roachtest to use the `mixedversion` framework (instead of the old `newUpgradeTest` API). It also updates `acceptance/decommission-self` since both tests used shared functionality that needed to be updated. Prior to this commit, the acceptance test used the old upgrade test API even though it was not an upgrade test. Fixes: #110531 Fixes: #110530 Release note: None 131364: upgrades: give test an additional core under remote exec r=rail a=rickystewart This has been timing out. Epic: none Release note: None Co-authored-by: Austen McClernon <[email protected]> Co-authored-by: Renato Costa <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Introduce two new cluster version gates:
Upon a range leader first encountering
V24_3_UseRACV2WithV1EntryEncoding
viahandleRaftReadyRaftMuLocked
,it will begin a new term using the replication flow control v2 protocol,
creating a
RangeController
but continue using the v1 entry encodingand raft still operating in push mode.
Upon a range leader first encountering
V24_3_UseRACV2Full
, it will continueusing the replication flow control v2 protocol, but will now switch to
using the V2 entry encoding.
Note that the necessary protocol migration at the leader, (base) =>
V24_3_UseRACV2WithV1EntryEncoding
occurs before any other calls inhandleRaftReadyRaftMuLocked
.The two version gates are necessary to ensure there are never v2 encoded
entries in the raft log while there is a possibility of a leader running
v1.
Move
EnabledWhenLeaderLevel
fromreplica_rac2
to the parent packagekvflowcontrol
and renameV2EnabledWhenLeaderLevel
to reflect themove to a shared v1/v2 package.
Also move the corresponding function
racV2EnabledWhenLeaderLevel
tokvflowcontrol
.GetV2EnabledWhenLeaderLevel
will check if there aretesting knob overrides for the enabled level, and if not continue
returning
V2NotEnabledWhenLeader
. Some commentary and todos are alsoleft around this function, for when we enable the protocol and
separately, pull mode.
Resolves: #131102
Release note: None